Information Relaxations and Duality in Stochastic Dynamic Programs
نویسندگان
چکیده
We describe a general technique for determining upper bounds on maximal values (or lower bounds on minimal costs) in stochastic dynamic programs. In this approach, we relax the nonanticipativity constraints that require decisions to depend only on the information available at the time a decision is made and impose a “penalty” that punishes violations of nonanticipativity. In applications, the hope is that this relaxed version of the problem will be simpler to solve than the original dynamic program. The upper bounds provided by this dual approach complement lower bounds on values that may be found by simulating with heuristic policies. We describe the theory underlying this dual approach and establish weak duality, strong duality, and complementary slackness results that are analogous to the duality results of linear programming. We also study properties of good penalties. Finally, we demonstrate the use of this dual approach in an adaptive inventory control problem with an unknown and changing demand distribution and in valuing options with stochastic volatilities and interest rates. These are complex problems of significant practical interest that are quite difficult to solve to optimality. In these examples, our dual approach requires relatively little additional computation and leads to tight bounds on the optimal values.
منابع مشابه
Information Relaxations, Duality, and Convex Stochastic Dynamic Programs
We consider the information relaxation approach for calculating performance bounds for stochastic dynamic programs (DPs). This approach generates performance bounds by solving problems with relaxed nonanticipativity constraints and a penalty that punishes violations of these nonanticipativity constraints. In this paper, we study DPs that have a convex structure and consider gradient penalties t...
متن کاملApproximations to Stochastic Dynamic Programs via Information Relaxation Duality
In the analysis of complex stochastic dynamic programs (DPs), we often seek strong theoretical guarantees on the suboptimality of heuristic policies: a common technique for obtaining such guarantees is perfect information analysis. This approach provides bounds on the performance of an optimal policy by considering a decision maker who has access to the outcomes of all future uncertainties befo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Operations Research
دوره 58 شماره
صفحات -
تاریخ انتشار 2010